[zlaski/memset-model] QL model for memset and friends#2027
Conversation
jbj
left a comment
There was a problem hiding this comment.
Be sure to import this new file from https://github.com/Semmle/ql/blob/0db648beaeba490ee29d00104d254c27c83f65ba/cpp/ql/src/semmle/code/cpp/models/Models.qll
| /** | ||
| * The standard function `memset` and its assorted variants | ||
| */ | ||
| class MemsetFunction extends ArrayFunction, DataFlowFunction, TaintFunction { |
There was a problem hiding this comment.
You can remove TaintFunction from the list of superclasses and remove Taint from the list of imports.
There was a problem hiding this comment.
Or add taint flow from the character and number parameters to the buffer??? An attacker with control of them has some (very limited) control of the data that ends up in the buffer and could plausibly use this for harm.
There was a problem hiding this comment.
I argued against that in #1933 (comment).
@zlaski-semmle and I just had a discussion about what constitutes taint. Here is the definition I wrote a long time ago: https://github.com/Semmle/ql/blob/e1594a4b0b9ff0cd3fe70e873ad38a98c0a6b41d/cpp/ql/src/semmle/code/cpp/dataflow/TaintTracking.qll#L5-L8. For other languages, like Java, it's completely different, and they haven't attempted to write down a definition. I think the only "definition" that fits all languages is: taint should flow where it's beneficial for some taint-tracking queries and harmful for none.
| /** | ||
| * The standard function `memset` and its assorted variants | ||
| */ | ||
| class MemsetFunction extends ArrayFunction, DataFlowFunction, TaintFunction { |
There was a problem hiding this comment.
I think AliasFunction should be extended as well in order to tell the escape analysis that memset doesn't retain a pointer for later. See https://github.com/Semmle/ql/blob/0db648beaeba490ee29d00104d254c27c83f65ba/cpp/ql/src/semmle/code/cpp/models/interfaces/Alias.qll#L43-L48
There was a problem hiding this comment.
Done. Let me know if I've over-specified things :-)
| hasGlobalName("memset") or | ||
| hasGlobalName("bzero") or | ||
| hasGlobalName("__builtin_memset") or | ||
| hasQualifiedName("std", "memset") |
|
LGTM. Most of our suggestions are actually requests for improvements, not fixes, so feel free to defer them as future work. |
|
|
||
| override predicate hasDataFlow(FunctionInput input, FunctionOutput output) { | ||
| input.isInParameter(0) and | ||
| output.isOutReturnValue() |
There was a problem hiding this comment.
I've just merged #1938, so these names are now deprecated.
|
This PR LGTM now. It needs
|
…emset.qll` to `Models.qll`.
…ction`; override predicates `parameterNeverEscapes`, `parameterEscapesOnlyViaReturn` and `parameterIsAlwaysReturned`.
536e967 to
a0cbd87
Compare
|
I believe this PR is now mergeable. I'd like to get it merged so that I may continue work on #1933. |
|
It looks like @geoffw0 has no objections, so I'll merge this. |
No description provided.